(54) Reclamation of processor resources in a data processor

(57) In a microprocessor, an apparatus is included 
for coordinating the use of physical registers in the 
microprocessor. Upon receiving an instruction, the 
coordination apparatus extracts source and destination 
logical registers from the instruction. For the destination 
logical register, the apparatus assigns a physical 
address to correspond to the logical register. In so 
doing, the apparatus stores the former relationship 
between the logical register and another physical regis- 
ter. Storing this former relationship allows the apparatus 
to backstep to a particular instruction when an execu- 
tion exception is encountered. Also, the apparatus 
checks the instruction to determine whether it is a spec- 
ulative branch instruction. If so, then the apparatus cre- 
ates a checkpoint by storing selected state information. 
This checkpoint provides a reference point to which the 
processor may later backup if it is determined that a
speculated branch was incorrectly predicted. Overall, 
the apparatus coordinates the use of physical registers 
in the processor in such a way that: (1) logical/physical 
register relationships are easily changeable; and (2) 
backup and backstep procedures are accommodated. 
Description

Related Applications 

The subject matter of this application is related to 
the subject matter of the following applications: 
European patent application 96101842.1;
European patent application 96101839.7; 
European patent application 96101840.5; 
European patent application 96101841.3; 
the European patent application entitled "METHOD 
AND APPARATUS FOR ACCELERATING CONTROL 
TRANSFER RETURNS"; 

the European patent application entitled "METHOD 
AND APPARATUS FOR SELECTING INSTRUCTIONS 
FROM ONES READY TO EXECUTE"; 
the European patent application entitled "METHODS 
FOR UPDATING FETCH PROGRAM COUNTER"; 
the European patent application entitled "METHOD
AND APPARATUS FOR RAPID EXECUTION OF CONTROL
TRANSFER INSTRUCTIONS";
the European patent application entitled "ECC PROTECTED
MEMORY ORGANIZATION WITH PIPELINED
READ-MODIFY-WRITE ACCESSES";
the European patent application entitled "METHOD 
AND APPARATUS FOR PRIORITIZING AND HAN- 
DLING ERRORS IN A COMPUTER SYSTEM"; 
the European patent application entitled "HARDWARE 
SUPPORT FOR FAST SOFTWARE EMULATION OF 
UNIMPLEMENTED INSTRUCTIONS"; and 
the European patent application entitled "METHOD 
AND APPARATUS FOR GENERATING A ZERO BIT 
STATUS FLAG IN A MICROPROCESSOR", 
the latter eight of which are filed simultaneously with this 
application. 

Field of the Invention 

This invention relates generally to microprocessors, 
and more particularly to a method and apparatus for 
efficiently coordinating the use of physical registers in a 
microprocessor during instruction execution. 

Background of the Invention 

Recent improvements in data processing have 
included the processing of instructions in parallel. In 
order to implement parallel processing of instructions, 
various techniques have been implemented including 
register renaming, speculative execution, and out-of- 
order execution. 

Register renaming is a technique utilized by processors
in which the processor remaps the same architectural
register to a different physical register in order
to avoid stalling instruction issues. This technique
requires the maintenance of a greater number of physical
registers than would otherwise be warranted architecturally.
The processor must, therefore, continuously
monitor the status of the physical register resources,
including how many of the physical registers are in use
at a given moment, to which architectural registers
the various physical registers are mapped, and which of the
physical registers are available for use. In order to
accomplish this task, the processor maintains a list of
physical registers ("freelist") that are not in use. When
an instruction is issued, the processor remaps the architectural
destination register to one of the registers on
the freelist. The selected physical register is then
removed from the freelist. Whenever the renamed physical
registers are no longer needed, these physical registers
are marked free by adding them to the freelist
pool. Those physical register resources which are
missing from the freelist are considered to be "in use"
or otherwise unavailable to the processor for further
mapping. Where the resultant register of an instruction
is to be used as a source (architectural) register for a
sequentially following instruction, the source register is
mapped to a renamed physical register from the freelist.
In order for the processor to use the correctly associated
physical register, a rename map is continuously
maintained by the processor which identifies which
architectural registers are mapped to which physical
registers. All sequentially subsequent instructions that
refer to a sequentially earlier instruction's architectural
register should use the renamed physical register.
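
The freelist and rename map described above can be illustrated with a minimal sketch. The register counts, the function names rename and release, and the use of small integers as register identifiers are assumptions made only for illustration; they are not taken from any particular processor.

```python
from collections import deque

NUM_LOGICAL = 4    # assumed number of architectural registers
NUM_PHYSICAL = 8   # assumed number of physical registers

# Rename map: architectural register index -> physical register index.
rename_map = {lr: lr for lr in range(NUM_LOGICAL)}
# Freelist: physical registers that are not currently in use.
freelist = deque(range(NUM_LOGICAL, NUM_PHYSICAL))

def rename(sources, dest):
    """Rename one instruction; return (physical sources, new dest, old dest)."""
    # Sources are translated first so that they see the earlier mappings.
    phys_sources = [rename_map[s] for s in sources]
    old_phys = rename_map[dest]
    new_phys = freelist.popleft()      # selected register leaves the freelist
    rename_map[dest] = new_phys
    return phys_sources, new_phys, old_phys

def release(phys):
    """Return a physical register to the freelist once it is no longer needed."""
    freelist.append(phys)

srcs, new, old = rename([0, 1], 3)     # e.g. an operation on LR0, LR1 -> LR3
later_srcs, _, _ = rename([3], 2)      # a later instruction that reads LR3
```

In this sketch the later instruction reads physical register 4, the register to which LR3 was renamed, rather than the original register 3.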

Speculative execution is a technique utilized by
processors in which the processor predicts a next
branch target address for a next instruction where data
is unavailable to evaluate a condition for a conditional
branch. By using speculative execution, processor
delays which would otherwise occur in waiting for the
data needed to evaluate the condition are avoided.
Whenever there is a misprediction, the processor must
be returned to the state which existed prior to the
branching step and the correct branch must be identified
in order to proceed with execution of the correct
sequence of instructions. In order to recover the state of
the processor after a misprediction, one technique that
has been utilized is called checkpointing, wherein the
machine state is stored (or checkpointed) after every
speculative instruction.

Out-of-order execution is a technique utilized by
processors in which the processor includes multiple
execution units which are issued instructions sequentially
but which may complete execution of instructions
non-sequentially due to varying execution times of the
instructions.

In processors where the architectural registers are
renamed, provisions must exist for efficiently restoring
the correct state of the architectural registers when the
processor does a backup to a checkpoint due to a mispredicted
branch instruction, or when a sequentially
later instruction modifies the architectural register
before the detection of an execution exception due to a
sequentially earlier instruction.

To elaborate, many instruction sequences in a computer
program are independent of other instruction
sequences. For example, in the following instruction
stream,

1 Load LR1, Mem1   ;Load logical register LR1 with data from Mem1

2 Inc LR1          ;Increment logical register LR1

3 Store LR1, Mem1  ;Store contents of logical register LR1 into Mem1

4 Load LR2, Mem2   ;Load logical register LR2 with data from Mem2

5 Inc LR2          ;Increment logical register LR2

6 Store LR2, Mem2  ;Store contents of logical register LR2 into Mem2,

the second three instructions (instructions 4-6) are independent
of the first three instructions (instructions 1-3).
That is, instructions 4-6 do not depend on the results of
instructions 1-3 to execute properly. Thus, in this example,
instructions 1-3 and instructions 4-6 could be executed
in parallel to optimize performance. It is this
concept of executing instructions in parallel and out of
sequence which underlies the executing methods of
superscalar processors.
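
The independence of the two instruction groups can be checked mechanically. The following is a minimal sketch under a deliberately conservative assumption: two groups are treated as independent when they touch no common register or memory location. The tuple encoding of the instructions and the helper names resources and independent are hypothetical.

```python
# Each instruction: (opcode, register, memory operand or None).
stream = [
    ("Load", "LR1", "Mem1"),
    ("Inc", "LR1", None),
    ("Store", "LR1", "Mem1"),
    ("Load", "LR2", "Mem2"),
    ("Inc", "LR2", None),
    ("Store", "LR2", "Mem2"),
]

def resources(instrs):
    """Collect every register and memory location a group of instructions touches."""
    used = set()
    for _, reg, mem in instrs:
        used.add(reg)
        if mem is not None:
            used.add(mem)
    return used

def independent(group_a, group_b):
    """Conservative test: the groups are independent if they share no resource."""
    return resources(group_a).isdisjoint(resources(group_b))

first, second = stream[:3], stream[3:]
parallel_ok = independent(first, second)   # instructions 1-3 versus 4-6
```

A real processor distinguishes reads from writes and tracks individual dependencies; the test here only shows why instructions 1-3 and 4-6 may proceed in parallel.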

To provide for parallel execution capability, superscalar
processors typically comprise more physical registers
than there are logical registers. Logical registers
are registers, such as LR1 and LR2 in the example
above, which are referenced in the instructions. Physical
registers are the registers within the processor
which are actually used for storing data during processing.
The extra physical registers are needed in superscalar
processors in order to accommodate parallel
processing. One consequence of having more physical
registers than logical registers is that there is no one-to-one
correspondence between the logical and physical
registers. Rather, a physical register may correspond to
logical register LR1 for one set of instructions and then
correspond to logical register LR2 for another set of
instructions. Because the relationship between logical
and physical registers can change, a mapping or coordination
function is performed in order to keep track of
the changing relationships. In order to optimize performance
in a superscalar processor, an efficient method
and apparatus for coordinating the use of the physical
registers is needed.

Summary of the Invention 

In accordance with the present invention, a data
processor which executes multiple instructions in parallel
uses renaming of physical registers in order to
reduce delays in processing and uses checkpoints to
retain the processor state and restore register
resources on the occurrence of an instruction exception
or branch misprediction. Additionally, the processor
includes a resource reclamation random access memory
(RAM) associating each issued instruction with an
issue serial number (ISN) and storing information about
the instruction including whether register renaming
occurs, and what associated architectural and physical
registers are utilized; and a resource reclamation
pointer (RRP) for identifying instructions in program
order that are prepared for retirement after successful
execution and identifying any associated physical registers
which may be added to the freelist.

The data processor includes a register file unit, a 
register reclaim file unit, a freelist unit, and a control
unit, which coordinates the use of physical registers in a 
microprocessor in such a manner that allows for easy 
physical register assignment and convenient state res- 
toration. The overall operation of the apparatus of the 
present invention is controlled by the control unit. In 
operation, the control unit receives an instruction, and in 
response, extracts a destination logical register value 
from the instruction. Then, the control unit obtains a free 
physical register identifier from the freelist unit which 
points to a particular physical register within the proces- 
sor. Once the destination logical register value and the 
free physical register identifier are obtained, the control 
unit stores the logical register value into the register file 
unit in association with the obtained physical register 
identifier. By so doing, a relationship is established 
between the logical register value and the physical reg- 
ister identifier which can be used to map the logical reg- 
ister value to the particular physical register. A physical 
register is thus assigned or mapped to a logical register 
value. 

In addition to establishing logical/physical register 
relationships, the apparatus of the present invention 
also performs at least two other important functions. 
First, if the instruction received is a branch instruction, 
then the apparatus creates a "checkpoint" which cap- 
tures the current state of the processor. This checkpoint 
provides a reference point in time to which the proces- 
sor can backtrack or backup if it is later determined that 
an incorrect branch was chosen. By creating these 
checkpoints, the apparatus of the present invention sup- 
ports speculative execution. As a second important 
function, when the apparatus of the present invention 
assigns a new physical register to a logical register 
value, the old relationship between the logical register 
value and another physical register is saved in the reg- 
ister reclaim file unit. This is done so that if an execution 
exception (such as a divide-by-zero) is encountered 
which requires the execution of a trap handler, the old 
relationship can be easily restored. By so doing, the 
apparatus of the present invention provides the proces- 
sor with a means for conveniently backstepping to a 
particular instruction before accessing a trap handler.
Overall, the present invention provides a convenient 
and efficient method and apparatus for coordinating the 
use of physical registers in a microprocessor. 

Brief Description of the Drawings 

Figure 1 is a block diagram representation of a 
processor wherein the present invention is implemented.

Figure 2 is a detailed block diagram representation
of the register management unit 16 of the present inven- 
tion. 

Figs. 3a-3c are block diagram representations of 
the register file unit 30, the register reclaim file unit 32, 
and the freelist unit 34 of the register management unit 
16 in several sample states. 

Figure 4 is an operational flow diagram of the control
unit 36 of the register management unit 16.

Figure 5 is a more detailed flow diagram for the 
backup process 134 of Figure 4. 

Figure 6 is a more detailed flow diagram for the 
backstep process 138 of Figure 4. 

Detailed Description of the Preferred Embodiment 

With reference to Figure 1, there is shown a block
diagram representation of a processor 10 wherein the 
present invention is implemented. As shown, processor 
10 preferably comprises an instruction issue unit 12, a 
sequencer 14 coupled to the instruction issue unit 12, a 
register management unit 16 coupled to the sequencer 
14, reservation stations 18 coupled to both the register 
management unit 16 and the instruction issue unit 12, 
and an execution unit 20 coupled to the reservation sta- 
tions 18 and the register management unit 16. In the 
preferred embodiment, processor 10 is a superscalar 
processor capable of executing multiple instructions in 
parallel. 

In processor 10, the instruction issue unit 12 
receives from an external source (not shown) a series of 
instructions, and stores these instructions for later exe- 
cution. The instruction issue unit 12 receives a clock 
signal as input and in each clock cycle, unit 12 issues 
one or more instructions for execution. The issued 
instructions are sent to both the sequencer 14 and the 
reservation stations 18. The sequencer 14, in response 
to the issued instructions, assigns a sequence number 
(Sn) to each of the instructions. As will be explained 
later, these sequence numbers are used by the register 
management unit 16 to keep track of the instructions. 
Once the sequence numbers are assigned to the 
instructions, the sequencer 14 forwards the instructions 
to the register management unit 16. 

Operationally, computer instructions reference logi- 
cal registers. These logical registers (which will be 
referred to herein as LR) may be source registers which 
contain certain data needed to execute the instruction,
or these registers may be destination registers to which 
data resulting from the execution of the instruction is to 
be written. For example, in the instruction 

LR1 ÷ LR2 --> LR3,

which divides the contents of LR1 by the contents of 
LR2 and writes the result into LR3, LR1 and LR2 are the 
source registers while LR3 is the destination register. 



Logical registers do not point to any physical location
at which a physical register resides. To get from a
logical register value to a physical register, a translation
or mapping process is carried out. This mapping function
is one of the functions performed by the register
management unit 16. If there were a one-to-one correspondence
between logical and physical registers, and
if their relationships were constant, then the mapping
function would be a simple one. All that would be
needed is a static translation table. However, as noted
previously, there is no constant one-to-one correspondence
in superscalar processors. Instead, relationships
between logical and physical registers are constantly
changing. Hence, the register management unit 16
needs to be able to handle the changing relationships.

As an additional complication, superscalar processors
engage in speculative execution. That is, whenever
a branch instruction is encountered, a guess or prediction
is made as to which branch will actually be taken.
Once the prediction is made, the instructions following
the predicted branch are executed. If later it is determined
that the wrong branch was predicted, then the
processor 10 will need to backtrack or "backup" to the
branch instruction, choose the proper branch, and then
execute instructions following that branch. In order to
backup properly, the processor 10 will need to be
restored to the state just prior to the branching operation.
This involves, among other operations, restoring
the relationships between the logical and physical registers.
To accommodate this backup procedure, the register
management unit 16 preferably coordinates the use
of the physical registers in such a way that relationship
restoration is possible.

As yet a further complication, superscalar processors
process instructions in parallel and out of
sequence. As a result, if an execution exception (such
as a divide-by-zero) is encountered which requires the
execution of a trap handler, it becomes necessary to
"backstep" to the instruction invoking the exception
before executing the trap handler. Like the backup procedure,
this "backstep" procedure involves, among
other operations, restoring the relationships between
the logical and physical registers. Unlike the backup
procedure, however, backstepping involves an instruction
other than a branch instruction. This requires different
handling, as will be explained below. The
management unit 16 preferably manages the use of the
physical registers in such a way that allows for this backstepping
procedure. With the above background information
in mind, the register management unit 16 will
now be described in greater detail.

Figure 2 shows a detailed block diagram of the preferred
embodiment of the register management unit 16.
As shown, management unit 16 preferably comprises a
register file unit 30, a register reclaim file unit 32, a freelist
unit 34, and a control unit 36 for controlling the overall
operation of the management unit 16.

The register file unit 30 is the component (or set of
components) in management unit 16 which is mainly
responsible for mapping logical registers to physical
registers, and for storing data. Register file unit 30 preferably
comprises a number of different components,
including a read-only memory (ROM) 40, a content
addressable memory (CAM) 42, a data random access
memory (data RAM) 44, an address valid (AV) RAM 46,
and at least one, and preferably a plurality of, checkpoint
RAM's 48₁-48ₙ. In register file unit 30, the ROM 40 is
used to store all of the physical register address identifiers
PR0-PRn for all of the physical registers in the processor
10. Each of the physical address identifiers PR0-PRn
points to an entry in the data RAM 44 wherein data
corresponding to the physical register identifier is
stored, along with a data valid (DV) bit. Preferably, there
is a one-to-one correspondence between the entries of
the ROM 40 and the entries of the data RAM 44. Since
the physical register identifiers PR0-PRn are stored in
ROM 40, they cannot be altered by writing operations;
hence, the physical register identifiers PR0-PRn remain
constant for the life of the processor 10.

The CAM 42 in register file unit 30 is used to store
logical register values corresponding to the physical
register identifiers. Unlike ROM 40, the contents of the
CAM 42 can be and are frequently changed. Preferably,
there is a one-to-one correspondence between the
entries of the CAM 42, the AV RAM 46, and the ROM
40, as shown in Figure 2. Together, ROM 40, CAM 42,
and AV RAM 46 provide a mechanism for quickly and
easily assigning a physical register to a logical register.
To illustrate, suppose that it is desirable to assign physical
register PR0 to logical register LR1. To establish
such a relationship, all that needs to be done is to store
the logical register value LR1 into the entry of the CAM
42 corresponding to the physical register identifier PR0,
and to set the AV bit in AV RAM 46 corresponding to the
physical register identifier PR0. Once that is done, the
next time the logical register LR1 is asserted in an
instruction, the CAM 42 will search for LR1 and will signal
a "hit" in the entry corresponding to the physical register
identifier PR0. This, in turn, will cause the identifier
PR0 to be outputted from the ROM 40, which in turn will
cause the corresponding entry in the data RAM 44 to be
accessed. Thus, as shown by this example, storing a
logical register value in CAM 42 establishes a direct
relationship between a logical address and a physical
address. To alter this relationship and to establish a new
one, all that is needed is to store the logical register
value LR1 into a different entry in the CAM 42.
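
The lookup path through the CAM 42, the ROM 40 and the data RAM 44 can be modelled in software as parallel arrays indexed by entry number. This is only an illustrative sketch: the array size of ten, the function names assign and lookup, and the choice to qualify a CAM hit by the AV bit are assumptions, and a hardware CAM compares all entries at once rather than scanning them.

```python
NUM_PHYSICAL = 10  # assumed size

rom = [f"PR{i}" for i in range(NUM_PHYSICAL)]  # fixed identifiers (ROM 40)
cam = [None] * NUM_PHYSICAL                    # logical register values (CAM 42)
av = [False] * NUM_PHYSICAL                    # address valid bits (AV RAM 46)
data_ram = [None] * NUM_PHYSICAL               # register contents (data RAM 44)

def assign(logical, entry):
    """Map a logical register to the physical register at index `entry`."""
    cam[entry] = logical
    av[entry] = True

def lookup(logical):
    """Return the index of the valid CAM entry holding `logical`, or None."""
    # Assumption: an entry counts as a hit only while its AV bit is set.
    for entry, value in enumerate(cam):
        if av[entry] and value == logical:
            return entry
    return None

assign("LR1", 0)          # store LR1 in the CAM entry paired with PR0
data_ram[0] = 42          # arbitrary data written to the mapped register
hit = lookup("LR1")       # the hit selects the ROM and data RAM entries
```

Because the three arrays share one index, a hit in the CAM directly selects both the identifier in the ROM and the data in the data RAM.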

Register file unit 30 preferably further comprises
checkpoint RAM's 48₁-48ₙ for temporarily storing the
contents of the AV RAM 46. As will be explained in
greater detail in a later section, whenever a speculative
branch instruction is encountered, a "checkpoint" is
made by storing the contents of the AV RAM 46 into one
of the checkpoint RAM's 48₁-48ₙ. By storing the contents
of the AV RAM 46, the state of the system prior to
branching is captured. This checkpoint provides a reference
point to backup to in case an incorrect branch is
predicted. Register file unit 30 preferably comprises a
plurality of checkpoint RAM's 48 in order to allow a plurality
of checkpoints to be made. This in turn allows the
processor 10 to execute through multiple levels of
nested branches.

The register reclaim file unit 32 is the component in 
management unit 16 which makes it possible for the unit 
16 to backstep to an instruction causing an execution 
exception. Preferably, unit 32 comprises a resource 
reclaim RAM 50 which has a plurality of entries. Each of 
the entries in the reclaim RAM 50 is preferably indexed 
with an issue sequence (or serial) number correspond- 
ing to a particular issued instruction. Associated with 
each entry is a rename valid bit portion, a first portion 
which stores a renamed architectural (logical) register 
identifier, and a second portion which stores an old 
physical register identifier. The rename valid bit portion 
stores a 'zero', if the particular instruction requires no 
renaming, and stores a 'one', if the particular instruction 
requires renaming of an architectural register. The old 
physical register identifier corresponds to the renamed 
architectural register identifier immediately before issue 
of the particular instruction. In effect, each entry of the 
reclaim RAM 50 stores a state of the system just prior to 
the execution of a particular instruction. Hence, the 
information stored in reclaim RAM 50 may be utilized to 
restore the system to the state that it had immediately 
prior to a given instruction. This aspect of the management
unit 16 makes backstepping possible.
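
The layout of a reclaim RAM entry, and the way such entries could be used to undo renames, can be sketched as follows. The entry fields follow the description above; the undo routine, which walks from the newest instruction back to a target instruction, is an assumption made for illustration and is not the backstep process of Figure 6. The names ReclaimEntry, record and undo are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ReclaimEntry:
    rename_valid: bool = False           # 'one' if the instruction renamed a register
    logical: Optional[str] = None        # renamed architectural (logical) register
    old_physical: Optional[str] = None   # mapping in force just before issue

# One entry per issue sequence number (four entries assumed here).
reclaim_ram = [ReclaimEntry() for _ in range(4)]

def record(seq, logical, old_physical):
    """Save the mapping that an instruction's rename is about to replace."""
    reclaim_ram[seq] = ReclaimEntry(True, logical, old_physical)

def undo(mapping, newest_seq, target_seq):
    """Undo renames from newest_seq back to and including target_seq."""
    for seq in range(newest_seq, target_seq - 1, -1):
        entry = reclaim_ram[seq]
        if entry.rename_valid:
            mapping[entry.logical] = entry.old_physical
    return mapping

mapping = {"LR3": "PR3"}
record(0, "LR3", "PR3")
mapping["LR3"] = "PR5"    # instruction 0 renames LR3 to PR5
record(2, "LR3", "PR5")   # instruction 1 renames nothing
mapping["LR3"] = "PR6"    # instruction 2 renames LR3 to PR6
restored = undo(dict(mapping), newest_seq=2, target_seq=0)
```

Undoing the entries in reverse issue order matters: the entry for instruction 2 restores PR5, and only then does the entry for instruction 0 restore PR3.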

Register management unit 16 preferably further 
comprises a freelist unit 34 which stores the physical 
register identifiers which are free to be assigned to logi- 
cal registers. The freelist unit 34 preferably comprises a 
freelist RAM 62 which stores free physical address 
identifiers, a head register 64 which stores a "head" 
pointer, a tail register 66 which stores a "tail" pointer,
and at least one, and preferably a plurality of checkpoint
registers 68₁-68ₙ. The freelist RAM 62 is preferably
operated as a FIFO storage. The head pointer points to 
the next physical register identifier in the RAM 62 which 
should be assigned to a logical register, while the tail 
pointer points to the last physical register identifier 
stored in the RAM 62. Each time a free physical register 
identifier is assigned to a logical register, the head 
pointer is incremented, and each time a free physical 
register identifier is added to the RAM 62, the tail pointer 
is incremented. Preferably, the freelist RAM 62 has at 
least (P-L) entries where P is the number of physical 
register identifiers and L is the number of logical register 
values. 
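
The FIFO behaviour of the freelist RAM 62 with its head and tail pointers can be sketched as a circular buffer. The wrap-around arithmetic, the count field used to detect an empty or full list, and the class and method names are assumptions; the text above specifies only that the head pointer advances on each assignment and the tail pointer advances on each addition.

```python
class Freelist:
    """Circular FIFO of free physical register identifiers."""

    def __init__(self, free_ids, capacity):
        assert len(free_ids) <= capacity
        self.ram = list(free_ids) + [None] * (capacity - len(free_ids))
        self.capacity = capacity
        self.head = 0                               # next identifier to assign
        self.tail = (len(free_ids) - 1) % capacity  # last identifier stored
        self.count = len(free_ids)

    def allocate(self):
        """Hand out the identifier at the head and advance the head pointer."""
        if self.count == 0:
            raise RuntimeError("no free physical register")
        pr = self.ram[self.head]
        self.head = (self.head + 1) % self.capacity
        self.count -= 1
        return pr

    def free(self, pr):
        """Advance the tail pointer and store a newly freed identifier there."""
        if self.count == self.capacity:
            raise RuntimeError("freelist full")
        self.tail = (self.tail + 1) % self.capacity
        self.ram[self.tail] = pr
        self.count += 1

# P = 10 physical and L = 5 logical registers give P - L = 5 entries.
fl = Freelist(["PR5", "PR6", "PR7", "PR8", "PR9"], capacity=5)
first = fl.allocate()
fl.free("PR3")
```

With P - L entries the list can never overflow, since at most P - L physical registers are unmapped at any one time.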

With regard to the checkpoint registers 68₁-68ₙ,
these are used to store the value of the head pointer
whenever a speculative branch instruction is encountered.
By storing the value of the head pointer, the state
of the freelist unit 34 is saved. This in turn allows the
state of the freelist RAM 62 to be restored if necessary.
The checkpoint registers 68 provide support for the
backup procedure. Preferably, freelist unit 34 comprises
a plurality of checkpoint registers 68 to allow multiple
checkpoints to be made. Multiple checkpoints enable
the processor 10 to execute through multiple levels of
nested branches.

The register management unit 16 preferably further 
comprises a control unit 36 for coordinating the operation
of the other components 30, 32, 34, and for controlling
the overall operation of the management unit 16. In
the preferred embodiment, the control unit 36 is imple- 
mented in hardware as a state machine. The control 
unit 36 will be described in greater detail as operation of 
the processor 10 is described. 

Referring once again to Figure 1, the reservation 
stations 18 of the processor 10 are responsible for two 
primary functions. First, the reservation stations 18 cap- 
ture all of the source data needed to execute an instruc- 
tion. This data is received from the register 
management unit 16. Second, reservation stations 18 
schedule instructions for execution. Instructions may be 
selected for execution if the DV bits from register file unit 
30 are asserted for all sources and if no older instruc- 
tions stored in the reservation stations are eligible. The 
instructions, once selected, are passed on to the execu- 
tion unit 20 for execution. Overall, stations 18 are 
responsible for the smooth execution of instructions. All 
of the relevant elements of the processor 10 have now 
been discussed. The overall operation of the processor 
10 will now be described. 
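
The selection rule applied by the reservation stations 18 (all source DV bits asserted, and no older eligible instruction) amounts to choosing the oldest ready instruction. A minimal sketch follows, assuming that a lower sequence number always means an older instruction and ignoring sequence number wrap-around; the dictionary layout and the name select_next are hypothetical.

```python
def select_next(stations, dv_bits):
    """Return the sequence number of the oldest instruction whose sources are all valid."""
    ready = [seq for seq, sources in stations.items()
             if all(dv_bits[pr] for pr in sources)]
    return min(ready) if ready else None

# DV bits per physical register; PR5 has not been written yet.
dv_bits = {"PR0": True, "PR1": True, "PR2": True, "PR5": False}
# Waiting instructions: sequence number -> source physical registers.
stations = {0: ["PR0", "PR1"], 1: ["PR5"], 2: ["PR2"]}

first = select_next(stations, dv_bits)
del stations[first]                       # instruction passed to the execution unit
second = select_next(stations, dv_bits)
```

Instruction 1 is skipped while its source is not valid, so instruction 2 executes ahead of it, which is the out-of-order behaviour described earlier.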

Before the processor 10 is used in regular opera- 
tion, it first needs to be initialized. Several steps are car- 
ried out in the initialization process. First, each and 
every logical register value is stored into one of the 
entries of the CAM 42. The AV bits corresponding to the 
CAM entries in which logical register values are stored 
are set. This ensures that, before operation, all of the
logical register values are validly mapped to a physical 
register. The particular mapping (i.e. which logical regis- 
ter value is mapped to which physical register) is arbi- 
trary. No single logical register value is mapped to more 
than one physical register, however. 

Once all of the logical register values are stored into 
the CAM 42, the free physical registers are known.
Accordingly, the physical register identifiers correspond- 
ing to the free registers are stored into the freelist RAM 
62. This provides an indication as to which physical reg- 
isters are free and may be assigned. 

As an example, suppose that the management unit
16 is initialized as shown in Figure 3a. More specifically,
suppose that physical register PR0 has been assigned
to logical register LR0, PR1 has been assigned to LR1,
PR2 has been assigned to LR2, PR3 has been assigned
to LR3, and PR4 has been assigned to LR4. After
assignment, data (Data0-Data4) may be written into the
corresponding locations in the data RAM 44, along with
asserted data valid bits. As shown in Figure 3a, physical
registers PR5-PR9 have not been assigned to any logical
register value. Thus, they are considered "free",
which means that they may be assigned logical register
values in upcoming operations. Thus, the physical register
identifiers PR5-PR9 associated with the free physical
registers are stored in the freelist unit 34. The head
pointer in the freelist unit is pointing to PR5, thereby indicating
that PR5 will be the next physical register
assigned to a logical register value. Currently, no information
is stored in the register reclaim file unit 32.

Now, suppose that the instruction issue unit 12 
issues the following instruction: 

LR0 ÷ LR1 --> LR3.

This instruction, when executed, will cause the data in
logical register LR0 to be divided by the data in logical
register LR1, and will cause the result to be stored into
logical register LR3. For this instruction, logical registers
LR0 and LR1 are the source logical registers from which
data will be drawn, while logical register LR3 is the destination
logical register. Once issued, the instruction is
passed on to the sequencer 14, where a sequence
number is assigned to the instruction. Suppose that
sequence number 0 is assigned. Thus, the instruction
becomes:

Instruction #0: LR0 ÷ LR1 --> LR3.

Once a sequence number is assigned, the instruction is 
passed on to the control unit 36 of the register manage- 
ment unit 16 for processing. 

In the management unit 16, it is the control unit 36 
which receives and processes new instructions. An 
operational flow diagram for the control unit 36 is shown 
in Figure 4. Preferably, control unit 36 begins operation 
by receiving 100 the instruction and then determining 
102 whether the instruction is a speculative branch 
instruction. If the instruction is a speculative branch 
instruction, then control unit 36 preferably creates a 
"checkpoint" to provide a reference point to which the 
processor 10 can backup in case the wrong branch of 
the branch instruction is predicted. In creating a check- 
point, two operations are taken. First, the contents of 
the AV RAM 46 are stored 104 into one of the checkpoint
RAM's 48₁-48ₙ. This operation preserves for later
reference all of the current relationships between the
logical registers and the physical registers. Second, the
contents of the head counter 64 in the freelist unit 34 are
stored 106 into one of the checkpoint registers 68₁-68ₙ.
By storing these two sets of information, the current
state of the processor 10 is recorded. This information
may be retrieved at a later time to restore the state of
the processor 10 to that just prior to the execution of the
speculative branch instruction. As will be explained in 
greater detail later, this aspect of the management unit 
16 enables the processor 10 to carry out the backup 
procedure of the present invention. 
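
The two checkpoint operations (steps 104 and 106) can be sketched as saving a copy of the AV bits together with the head value. The restore routine shown is an assumption about how the saved values might be used and is not the backup process of Figure 5; the class name RenameState, the limit of four checkpoints, and the policy of discarding younger checkpoints on restore are likewise hypothetical.

```python
MAX_CHECKPOINTS = 4  # assumed number of checkpoint RAM's and checkpoint registers

class RenameState:
    def __init__(self, av_bits, head):
        self.av = list(av_bits)   # contents of the AV RAM
        self.head = head          # freelist head pointer
        self.checkpoints = []     # each entry: (copy of AV bits, head value)

    def checkpoint(self):
        """Steps 104 and 106: save the AV bits and the head pointer."""
        if len(self.checkpoints) == MAX_CHECKPOINTS:
            raise RuntimeError("no checkpoint storage left")
        self.checkpoints.append((list(self.av), self.head))
        return len(self.checkpoints) - 1

    def backup(self, index):
        """Assumed restore: reload the saved values, drop younger checkpoints."""
        av_bits, head = self.checkpoints[index]
        self.av = list(av_bits)
        self.head = head
        del self.checkpoints[index:]

initial_av = [True] * 5 + [False] * 5     # PR0-PR4 mapped, PR5-PR9 free
state = RenameState(initial_av, head=0)
cp = state.checkpoint()                   # speculative branch encountered
state.av[3], state.av[5] = False, True    # a later rename moves LR3 to PR5
state.head = 1
state.backup(cp)                          # branch found to be mispredicted
```

Saving only the AV bits is sufficient in this sketch because a rename leaves the old CAM entry in place and merely clears its AV bit, so restoring the bits revives the old mapping.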

In the present example, the instruction (LR 0 ÷ LR 1
--> LR 3) is not a speculative branch instruction; thus,
control unit 36 bypasses steps 104 and 106 and proceeds
to step 108 to extract a destination logical register
value from the instruction. In the present example,
the destination logical register value is LR 3. Once the
destination logical register value is extracted, control
unit 36 accesses the freelist RAM 62 to extract 110 the
next available physical register identifier therefrom. As
shown in Figure 3a, the head pointer is currently pointing
to physical register identifier PR 5; thus, PR 5 is
selected as the physical register to assign to the logical
register LR 3. Thereafter, control unit 36 increments 112
the head counter 64 to cause the counter to point to the
next available free physical register, which in the example
is PR 6.
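
The allocation performed in steps 110 and 112 can be sketched in software as follows. This is only an illustrative model, not the disclosed hardware: the class name `Freelist`, the circular-buffer layout, the list contents, and the handling of an empty list are all assumptions made for the example.

```python
class Freelist:
    """Hypothetical software model of freelist unit 34."""

    def __init__(self, free_ids, capacity):
        # Freelist RAM 62, modelled here as a fixed-size circular buffer.
        self.ram = list(free_ids) + [None] * (capacity - len(free_ids))
        self.head = 0               # head counter 64: next identifier to assign
        self.tail = len(free_ids)   # tail counter 66: next slot for a released identifier

    def peek(self):
        """Return the identifier the head counter currently points to."""
        return self.ram[self.head % len(self.ram)]

    def allocate(self):
        """Steps 110 and 112: read the identifier at the head, then advance the head."""
        if self.head == self.tail:
            # Behaviour on an empty freelist is an assumption of this sketch.
            raise RuntimeError("no free physical registers")
        pr = self.peek()
        self.head += 1
        return pr


# Illustrative contents; the text only states that the head points at PR 5, then PR 6.
freelist = Freelist(free_ids=[5, 6, 7], capacity=8)
assigned = freelist.allocate()   # PR 5 is assigned to LR 3
next_free = freelist.peek()      # the head counter now points at PR 6
```

Keeping the head as a simple counter is what later allows a checkpoint or a backstep to undo an allocation by restoring or decrementing a single value.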

After a physical register identifier is retrieved from
the freelist unit 34, control unit 36 applies 114 the logical
register value LR 3 to the CAM 42. In effect, this operation
checks for the current physical register assignment
for logical register LR 3. In the present example, LR 3 is
currently assigned to physical register PR 3, as shown in
the register file unit 30. Thus, when LR 3 is applied to the
CAM 42, a hit will be found which will cause the physical
register identifier PR 3 to be read out of the ROM 40.
This physical register identifier PR 3, along with the logical
register value LR 3, is then stored into the reclaim
RAM 50 in the entry corresponding to the sequence
number of the current instruction (sequence #0), as
shown in Figure 3b. The information in reclaim RAM 50
will be used, if necessary, to backstep the processor 10
at a later time to the current state of the processor 10.

After the logical register value LR 3 and its corresponding
old physical register identifier PR 3 are stored
into the reclaim RAM 50, control unit 36 clears the AV bit
corresponding to the old physical register assignment,
which in the example is the AV bit in AV RAM 46 corresponding
to the physical register identifier PR 3. Once
that is done, control unit 36 loads 120 the logical register
value LR 3 into the CAM 42 in the entry corresponding
to the new physical register identifier, which is PR 5.
The logical register value LR 3 is thus assigned to a new
physical register. Thereafter, the AV bit corresponding
to physical register PR 5 is set 122 to indicate that this is
now the current physical register assignment for logical
register LR 3. Note, however, that the data valid DV bit
corresponding to PR 5 is not set. This is because no data
has yet been written into the corresponding entry of the
data RAM 44 because the instruction has not yet been
executed. The DV bit will be set once the instruction is
executed and appropriate data is stored into the proper
entry of the data RAM 44.

Note from Figure 3b that at this point, the CAM 42
has two entries wherein logical register value LR 3 is
stored. This would appear to cause confusion. However,
note that only the AV bit corresponding to the current
physical register assignment (PR 5) is set, thereby indicating
that the entry corresponding to PR 5 is the current
assignment. The AV bit corresponding to the old physical
register (PR 3) is not set. This manipulation of the AV
bit forestalls any confusion that might arise due to multiple
instances of the same logical register value. Using
the process thus far described, a destination logical register
may be assigned to a new and different physical
register.
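
The destination renaming sequence described above (CAM lookup, reclaim RAM write, and AV bit update) can be sketched as follows. The sketch is illustrative only: the function name `rename_destination`, the use of Python lists indexed by physical register identifier, and the initial table contents are assumptions; the text itself only fixes the relationships LR 0/PR 0, LR 1/PR 1, and LR 3/PR 3.

```python
def rename_destination(cam, av, dv, reclaim, seq, lr, new_pr):
    """Assign logical register `lr` to the free physical register `new_pr`."""
    # Step 114: CAM lookup. Only an entry whose AV bit is set counts as a hit.
    old_pr = next((pr for pr, tag in enumerate(cam) if tag == lr and av[pr]), None)
    # Save the former relationship, indexed by instruction sequence number.
    reclaim[seq] = (lr, old_pr)
    if old_pr is not None:
        av[old_pr] = False   # the old assignment is no longer current
    cam[new_pr] = lr         # step 120: load LR into the CAM entry for the new PR
    av[new_pr] = True        # step 122: mark the new assignment as current
    dv[new_pr] = False       # DV stays clear until execution writes a result
    return old_pr


# Assumed starting state: LR i is held in PR i for i = 0..4, and PR 5..7 are free.
cam = [0, 1, 2, 3, 4, None, None, None]   # CAM 42: logical register tag per PR entry
av = [True] * 5 + [False] * 3             # AV RAM 46
dv = [True] * 5 + [False] * 3             # data valid bits
reclaim = {}                              # reclaim RAM 50

old = rename_destination(cam, av, dv, reclaim, seq=0, lr=3, new_pr=5)
```

After the call, the tag for LR 3 appears in the entries for both PR 3 and PR 5, and only the AV bits tell the two apart.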



The assignment of a new physical register to a destination
logical register is only one of the functions performed
by control unit 36. The other function is to
retrieve the source data needed to execute the instruction.
Steps 124-130 of Figure 4 illustrate this retrieval
process. In the present example, the source logical registers
are LR 0 and LR 1. Hence, in order to execute the
instruction, data will need to be retrieved from the storage
registers indicated by LR 0 and LR 1.

In carrying out the retrieval process, control unit 36
first determines 124 whether any source logical registers
are indicated by the instruction. If source logical
registers are indicated by the instruction, as is the case
in the present example, then control unit 36 begins the
retrieval process by applying 126 the source logical registers
LR 0, LR 1 to the CAM 42. Specifically, when LR 0 is
applied to CAM 42, a hit is found in the entry of the CAM
42 corresponding to physical register identifier PR 0.
Since the AV bit for this entry is set to "1", this hit causes
the physical register identifier PR 0 to be outputted from
the ROM 40, which in turn causes the corresponding
data (Data 0) to be outputted from the data RAM 44. The
data from logical register LR 0 is thereafter sent 128 to
the reservation stations 18. Since the data valid DV bit
for this entry is set to "1", the data will be used by the
reservation stations in executing the instruction. A similar
process takes place when LR 1 is applied to the CAM
42. Specifically, the application of LR 1 causes a hit to be
found in the entry of the CAM 42 corresponding to physical
register identifier PR 1. This hit causes the physical
register identifier PR 1 to be outputted from ROM 40.
This in turn causes the data (Data 1) corresponding to
PR 1 to be outputted from the data RAM 44 to the reservation
stations. Data from logical register LR 1 is thus
passed on to the reservation stations. Again, since the
data valid bit corresponding to the outputted data is set,
the reservation stations 18 will use the data in executing
the instruction.
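
The source retrieval of steps 126 and 128 can be sketched as a tag match qualified by the AV bit. The function name `read_source`, the list-based tables, and the data values are hypothetical and chosen only for illustration.

```python
def read_source(cam, av, dv, data, lr):
    """Return (physical_id, value, data_valid) for source logical register `lr`."""
    for pr, tag in enumerate(cam):
        # A hit requires both a tag match and a set AV bit.
        if tag == lr and av[pr]:
            return pr, data[pr], dv[pr]
    # Behaviour on a miss is an assumption of this sketch.
    raise LookupError(f"no current assignment for LR {lr}")


# Assumed state after LR 3 has been renamed from PR 3 to PR 5 but not yet written.
cam = [0, 1, 2, 3, 4, 3, None, None]
av = [True, True, True, False, True, True, False, False]
dv = [True, True, True, True, True, False, False, False]
data = [40, 8, 0, 7, 0, None, None, None]   # illustrative register contents

src0 = read_source(cam, av, dv, data, lr=0)
src1 = read_source(cam, av, dv, data, lr=1)
dest = read_source(cam, av, dv, data, lr=3)
```

Here `dest` resolves to PR 5 rather than PR 3 even though both entries carry the tag LR 3, and its valid flag is false because no result has been written yet.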

After data from the source logical registers LR 0,
LR 1 are sent to the reservation stations 18, control unit
36 further sends 130 the physical register identifier
assigned to the destination logical register. In the
present example, the destination logical register is LR 3
and the physical register assigned to LR 3 is PR 5.
Hence, in step 130, control unit 36 sends physical
address identifier PR 5 to the reservation stations. After
step 130, the reservation stations 18 have all of the
information needed to execute the instruction. Thus, the
reservation stations 18 schedule the instruction for
execution and, at the appropriate time, send the instruction
along with the information discussed above to the execution
unit 20. In response, execution unit 20 executes the instruction
and writes the resulting data into an appropriate
entry in the data RAM 44. Since the physical register
identifier PR 5 was sent to the execution unit 20, the
resulting data will be written into the entry of the data
RAM 44 corresponding to the physical register identifier
PR 5, which is the correct entry. In addition, execution
unit 20 preferably sets the data valid bit corresponding
to the entry to indicate that the data in the entry is now
valid and can be used as source data. Thereafter, execution
of the instruction is complete.

The process described above is what takes place 
when an instruction executes smoothly. However, as 
noted previously, two events may occur which may 
affect smooth execution. The first event is an acknowl- 
edgment that an incorrect branch was taken at a specu- 
lative branch instruction. The second event is the 
invocation of an execution exception which requires a 
trap handler to be executed. To remedy the first event, a 
backup procedure is implemented to restore the processor
10 to the state it had just prior to the branch instruction.
To remedy the second event, a backstep procedure
is implemented to restore the processor 10 to the state 
it had just prior to the instruction invoking the execution 
exception. In the event that an incorrect branch was 
taken, the sequencer 14 will issue a "backup" control 
signal to the control unit 36 of the register management 
unit 16. In the event of an execution exception, the exe- 
cution unit 20 will issue a "backstep" control signal to 
the control unit 36. In steps 132 and 136, control unit 36 
checks for these control signals. If one of these control 
signals is detected, then control unit 36 will take appro- 
priate action. 

To illustrate the backup procedure, suppose that a
speculative branch instruction is issued and sent to the
register management unit 16. In processing this instruction,
control unit 36 first determines 102 (Figure 4)
whether the instruction is a speculative branch instruction.
If the instruction is a speculative branch instruction,
as it is in the present example, then control unit 36 proceeds
to carry out steps 104 and 106 to create a
"checkpoint". In step 104, the contents of the AV RAM
46 are stored into one of the checkpoint RAMs, 48 1, as
shown in Figure 3c. In step 106, the contents of the
head counter 64 in the freelist unit 34 are stored into
one of the checkpoint registers, 68 1. These two operations
preserve the state of the processor 10 prior to execution
of the branch instruction to create a reference
state to which the processor may return. Once that is
done, the instruction is processed in the same manner
as other instructions.

Suppose now that during the following instruction or
several instructions thereafter it is discovered that a
speculative branch was mispredicted and that a wrong
branch was taken. In such a case, it will be necessary to
restore the processor 10 to the state that it had just prior
to the branch instruction. This restoration or backup is
achieved as follows. First, the sequencer 14 generates
and sends a "backup" signal to the control unit 36. This
"backup" signal is detected by control unit 36 in step 132
and in response, control unit 36 implements the backup
procedure shown in Figure 5. Preferably, control unit 36
begins the backup procedure by overwriting 152 the
contents of the AV RAM 46 with the contents of the
checkpoint RAM 48 1 (Figure 3c) which were stored in
the checkpoint RAM 48 1 during step 104. Further, control
unit 36 overwrites 154 the contents of the head
counter 64 in the freelist unit 34 with the contents of the
checkpoint register 68 1 which were stored in the checkpoint
register 68 1 in step 106. By carrying out these two
steps, control unit 36 restores the processor 10 to the
state that it had just prior to the branch. Processor
backup is thus achieved.
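
Checkpoint creation (steps 104 and 106) and backup (steps 152 and 154) can be sketched as a snapshot and restore of just two items, the AV bits and the head counter. The dictionary layout and the function names `create_checkpoint` and `backup` are assumptions made for this example.

```python
def create_checkpoint(state):
    """Steps 104 and 106: copy the AV RAM contents and the freelist head counter."""
    return {"av": list(state["av"]), "head": state["head"]}


def backup(state, checkpoint):
    """Steps 152 and 154: overwrite the AV RAM and head counter with the saved copies."""
    state["av"][:] = checkpoint["av"]
    state["head"] = checkpoint["head"]


# Assumed state at the speculative branch: LR i held in PR i for i = 0..4.
state = {
    "cam": [0, 1, 2, 3, 4, None, None, None],
    "av": [True] * 5 + [False] * 3,
    "head": 0,   # index into the freelist; the entry at the head is PR 5
}
checkpoint = create_checkpoint(state)

# A speculatively issued instruction renames LR 3 from PR 3 to PR 5.
state["cam"][5] = 3
state["av"][3] = False
state["av"][5] = True
state["head"] += 1

# The branch turns out to be mispredicted.
backup(state, checkpoint)
```

In this sketch the CAM is not restored. The stale LR 3 tag left in the PR 5 entry is harmless because its AV bit is clear again, and PR 5 is once more the next identifier at the freelist head.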

It is sometimes necessary to restore a machine
state not only to a checkpointed location but to an
instruction which lies between checkpoints. An example
of such a situation is one where an instruction encounters
a divide-by-zero condition. Since a divide-by-zero is
not possible, such an instruction usually invokes an
exception trap. Before accessing the trap handler, however,
it is first necessary to backstep the machine state
to that which existed immediately before the instruction
was executed. This "backstep" is usually difficult to
achieve because there is no checkpoint created for the 
instruction since the instruction is not a branch instruction.
With the present invention, however, backstepping
to an instruction can be performed easily.

To illustrate this backstep procedure, reference will
be made to Figs. 3b, 4, and 6, and to a specific example.
To draw upon a previously used example, suppose
again that the instruction

Instruction #0: LR 0 ÷ LR 1 --> LR 3

is received by the control unit 36 of the register management
unit 16. Suppose further again: (1) that the logical
register value LR 3 is written into the CAM entry corresponding
to the physical identifier entry PR 5 as shown
in Figure 3b, thereby assigning LR 3 to physical register
PR 5; and (2) that the logical register value LR 3 and its
corresponding former physical register identifier PR 3
are stored into the reclaim RAM 50 in the entry indexed
by the instruction sequence #0. The writing of these values
into CAM 42 and reclaim RAM 50 was described
previously with reference to steps 108-122 of Figure 4.
Hence, these operations need not be re-described here.
The important points to note here are: (1) that there are
two instances of LR 3 stored in the CAM 42, one corresponding
to the currently assigned physical register
identifier PR 5, and one corresponding to the formerly
assigned physical register identifier PR 3; and (2) that
the reclaim RAM 50 stores the former relationship
between LR 3 and PR 3. It is the information in reclaim
RAM 50 which allows for easy backstepping.

To illustrate how backstepping is achieved, suppose
that the data corresponding to LR 1 is a zero. If such is
the case, then the above instruction would be a divide-by-zero
operation. Upon executing this instruction, the
execution unit 20 will issue a "backstep" control signal.
This signal is detected by control unit 36 in step 136,
and in response, control unit 36 carries out the backstep
procedure 138 of the present invention. The backstep
procedure is shown in greater detail in Figure 6.

Preferably, control unit 36 begins the backstep procedure
by determining 162 the sequence number associated
with the instruction which caused the execution
exception. In the present example, the sequence
number associated with the instruction is 0. Once determined,
the instruction sequence number is used as an
index to retrieve 164 from the reclaim RAM 50 the logical
register value LR 3 and its corresponding former
physical register identifier PR 3. Thereafter, control unit
36 applies the logical register value LR 3 retrieved from 
the reclaim RAM 50 to the CAM 42 to determine 166 
which physical register is currently assigned to the logi- 
cal register LR 3 . As shown in Figure 3b, physical regis- 
ter PR 5 is currently assigned to LR 3 . Once the current 
assignment is found, control unit 36 clears 168 the AV 
bit corresponding to the currently assigned physical reg- 
ister. Hence, in the present example, the AV bit corre- 
sponding to PR 5 is cleared. Thereafter, control unit 36 
writes 170 the logical register value LR 3 into the CAM 
42 at an entry corresponding to the former physical 
address identifier. Hence, in the present example, LR 3 
is written into the CAM entry corresponding to the phys- 
ical register identifier PR 3 . Once that is done, control 
unit 36 sets 172 the AV bit corresponding to the former
physical address identifier PR 3 . By carrying out the 
above steps, the current logical register/physical regis- 
ter relationship is erased and the former relationship is 
reinstated. As a final step, the head counter 64 in the 
freelist unit 34 is decremented 174 so that it once again 
points to PR 5 as it did before the above instruction was 
processed. Thus, the state of the processor 10 is 
restored or backstepped to the desired state. The trap 
handler may now be accessed. 
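
The backstep procedure of steps 164 through 174 can be sketched as the mirror image of destination renaming. The function name `backstep`, the list-based tables, and the use of a plain integer for the head counter are assumptions of this illustration.

```python
def backstep(cam, av, reclaim, seq, head):
    """Undo the renaming performed for instruction `seq`; return the new head value."""
    lr, former_pr = reclaim[seq]   # step 164: read the saved former relationship
    # Step 166: find the physical register currently assigned to `lr`.
    current_pr = next(pr for pr, tag in enumerate(cam) if tag == lr and av[pr])
    av[current_pr] = False         # step 168: erase the current relationship
    cam[former_pr] = lr            # step 170: write LR into the former PR's CAM entry
    av[former_pr] = True           # step 172: reinstate the former relationship
    return head - 1                # step 174: the head points at the undone PR again


# Assumed state after instruction #0 renamed LR 3 from PR 3 to PR 5.
cam = [0, 1, 2, 3, 4, 3, None, None]
av = [True, True, True, False, True, True, False, False]
reclaim = {0: (3, 3)}
head = 1   # index into the freelist; index 0 holds PR 5

head = backstep(cam, av, reclaim, seq=0, head=head)
```

Only the entry indexed by the faulting instruction's sequence number is consulted, which is why no checkpoint is needed for this case.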

At this point, it should be noted that the backstep 
procedure is quite similar to the register assignment 
procedure discussed in steps 108-122 of Figure 4. The
only difference is that instead of assigning the logical 
register value to a free physical register, the logical reg- 
ister value is assigned back to a physical register with 
which it had a previous relationship. The similarity 
between the two processes is significant because it 
means that the same hardware used to implement the 
register assignment process can be used to implement 
the backstep procedure. Thus, very little additional 
hardware (in fact, a simple additional multiplexer) is 
required to implement the backstep procedure.

The backstepping procedure has been described 
as backstepping over only one instruction. It should be 
noted, however, that the method and apparatus of the 
present invention may be used to backstep over any 
desired number of instructions. Typically, however, the 
number of instructions which may be backstepped over
is limited to the number of instructions which can be 
renamed per cycle. This number may vary from system 
to system. 
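
Backstepping over several instructions can be sketched by repeating the single-instruction procedure from the newest instruction back to the target, that is, in reverse sequence. The helper names, the second instruction (#1), and the table contents below are hypothetical; any limit on how many instructions can be undone per cycle is not modelled.

```python
def backstep_one(cam, av, reclaim, seq):
    """Undo the renaming recorded in the reclaim entry for instruction `seq`."""
    lr, former_pr = reclaim[seq]
    current_pr = next(pr for pr, tag in enumerate(cam) if tag == lr and av[pr])
    av[current_pr] = False
    cam[former_pr] = lr
    av[former_pr] = True


def backstep_to(cam, av, reclaim, head, newest_seq, target_seq):
    """Undo instructions newest_seq down to target_seq, newest first."""
    for seq in range(newest_seq, target_seq - 1, -1):
        backstep_one(cam, av, reclaim, seq)
        head -= 1   # one freelist allocation is undone per instruction
    return head


# Instruction #0 renamed LR 3 from PR 3 to PR 5; a hypothetical instruction #1
# then renamed LR 3 again, from PR 5 to PR 6.
cam = [0, 1, 2, 3, 4, 3, 3, None]
av = [True, True, True, False, True, False, True, False]
reclaim = {0: (3, 3), 1: (3, 5)}

head = backstep_to(cam, av, reclaim, head=2, newest_seq=1, target_seq=0)
```

In this sketch the order matters: undoing #0 before #1 would leave LR 3 assigned to PR 5 rather than PR 3.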

Referring again to Figure 4, after control unit 36 carries
out steps 100-138, it makes a determination as to
whether to send a certain physical register identifier
back to the freelist unit to make the corresponding physical
register available again for reassignment. A physical
register identifier will be sent or released to the freelist
unit 34 if the instruction for which the physical register is
used is committed. An instruction will be committed if
the instruction successfully completed execution and if
all previous instructions completed execution without
error or exceptions. In step 140, control unit 36 makes a
determination as to whether a certain instruction has
been committed. If so, then control unit 36 releases the
physical register identifier or identifiers associated with
the instruction to the freelist unit 34. Continuing with the
present example, suppose that the instruction

LR 0 ÷ LR 1 --> LR 3

is committed. In such a case, the physical register identifier
PR 3 (stored in the reclaim RAM 50) corresponding
to the destination register LR 3 is passed 142 to the
freelist unit 34. More specifically, control unit 36 adds
the physical register identifier PR 3 to the tail of the freelist
RAM 62 and then increments the tail counter 66. The
free register PR 3 is thus added to the list of physical registers
which may be assigned to a logical register.
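
The commit test of step 140 and the release of step 142 can be sketched as follows. The names `is_committed` and `release_on_commit`, the `completed_ok` list (one flag per sequence number, true when that instruction finished without error), and the circular freelist layout are assumptions made for this example.

```python
def is_committed(completed_ok, seq):
    """Step 140: an instruction commits once it and all earlier ones finished cleanly."""
    return all(completed_ok[: seq + 1])


def release_on_commit(freelist_ram, tail, reclaim, seq):
    """Step 142: write the former physical register at the tail, then advance the tail."""
    _lr, former_pr = reclaim[seq]
    freelist_ram[tail % len(freelist_ram)] = former_pr
    return tail + 1


freelist_ram = [5, 6, 7, None, None, None, None, None]   # freelist RAM 62
tail = 3                                                 # tail counter 66
reclaim = {0: (3, 3)}                                    # LR 3 was formerly in PR 3
completed_ok = [True]                                    # instruction #0 completed

if is_committed(completed_ok, seq=0):
    tail = release_on_commit(freelist_ram, tail, reclaim, seq=0)
```

It is the former register (PR 3), not the newly assigned one (PR 5), that is released, because no backstep to an earlier mapping is possible once the instruction is committed.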

The present invention has been described with reference
to specific examples. However, the invention
should not be construed to be so limited. Various modifications
may be made by one of ordinary skill in the art
with the benefit of this disclosure without departing from
the spirit of the invention. For example, for the sake of
simplicity, the invention has been described with the
assumption that only one instruction is issued per cycle.
It should be noted, though, that the invention may be
and is actually intended to be implemented in processors
which issue multiple instructions per clock cycle. To
accommodate multiple instructions per cycle, more
ports may be added to the register file unit 30, the register
reclaim file unit 32, and the freelist unit 34. These
and other modifications are within the spirit and contemplation
of the invention. Therefore, the present invention
should not be limited by the examples used to illustrate
it but only by the scope of the appended claims.

Claims

1. A method for executing multiple, sequential instructions
in parallel within a data processing device
comprising the steps of:

issuing a series of sequential instructions
including a first and second instruction;

identifying a first logical register with a physical
register in accordance with the first instruction;

identifying a second logical register with the
physical register in accordance with the second
instruction;

executing the second instruction out of
sequence with the first instruction;

retaining device state information including
an identification of the sequential ordering of the
second instruction, the identification of the second
logical register with the physical register, and the
identification of the first logical register with the
physical register; and

restoring the device state information on the
occurrence of a data corrupting event.

2. A method for issuing each of a sequence of instructions
utilized within a parallel processing device
comprising the steps of:

identifying the sequential order of an instruction;

determining whether a logical register is referenced
by the instruction and if a logical register is
referenced:

associating the referenced logical register
with a physical register, and

determining whether the referenced logical
register was associated with a prior physical register
from a previously issued instruction; and

storing state information including:

an identifier of the sequential ordering of the
instruction,

the association of the referenced logical register
with the physical register, if the instruction
references a logical register, and

the association of the referenced logical register
with the prior physical register, if the logical
register was previously associated with a prior
physical register.

3. The method as in Claim 2, wherein the storing step 
includes 

storing a rename identifier identifying a
change of physical register association. 

4. A method for multiple instruction execution of a
sequence of instructions within a parallel processing
device comprising the steps of:

issuing each of a sequence of instructions by

identifying the sequential order of an instruction;

determining whether a logical register is referenced
by the instruction and if a logical register is
referenced:

associating the referenced logical register
with a physical register, and

determining whether the referenced logical
register was associated with a prior physical register
from a previously issued instruction; and

storing state information including:

an identifier of the sequential ordering of the
instruction,

the association of the referenced logical register
with the physical register, if the instruction
references a logical register, and

the association of the referenced logical register
with the prior physical register, if the logical
register was previously associated with a prior
physical register;

executing the instruction sequence by

simultaneously selecting multiple instructions.



5. The method of Claim 4, including the step of: 

upon the occurrence of a data corrupting 
event, stopping execution of selected instructions; 
and 

returning the state of the processing device 
to a point in time prior to corrupting the data by 

backstepping the state in reverse sequence 
by utilizing the previously stored state information 
for each issued instruction executed after the point 
in time, and avoiding changing the state if the 
rename identifier does not identify a change of 
physical register association for a selected instruc- 
tion. 

6. The method of Claim 4, including the step of: 

reclaiming physical registers by 

monitoring execution completion of each 
instruction in the sequence; 

upon execution completion of an instruction 
in the sequence: 

determining whether any other instructions 
in the sequence which are earlier in sequential 
order have not completed execution, and 

if no earlier instructions have not completed 
execution, retrieving the previously stored associa- 
tion of the prior referenced physical register and 
designating availability of the prior referenced phys- 
ical register. 

7. A method for executing multiple, sequential instruc- 
tions in parallel within a data processing device 
comprising the steps of: 

issuing each of a sequence of instructions 
including a first and second instruction by:

identifying the sequential order of an instruc- 
tion; 

determining whether a logical register is ref- 
erenced by the instruction and if a logical register is 
referenced: 

associating the referenced logical register 
with a physical register, and 

determining whether the referenced logical 
register was associated with a prior physical regis- 
ter from a previously issued instruction; and 

storing state information including: 

an identifier of the sequential ordering of the 
instruction, 

the association of the referenced logical register
with the physical register, if the instruction
references a logical register, and

the association of the referenced logical reg- 
ister with the prior physical register, if the logical 
register was previously associated with a prior 
physical register; 

identifying a first logical register with a phys- 
ical register in accordance with the first instruction; 

identifying a second logical register with the
physical register in accordance with the second
instruction;

executing the second instruction out of
sequence with the first instruction; and

restoring the device state information on the
occurrence of a data corrupting event.